DETAILED ACTION

Continued Examination Under 37 CFR 1.114 

A request for continued examination under 37 CFR 1.114, including the fee set
forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this
application is eligible for continued examination under 37 CFR 1.114, and the fee set
forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action
has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on
1/25/2008 has been entered. Claims 2 and 10 have been cancelled. Claims
1, 3, 9, 11, 17, 18, 35, 52 and 53 have been amended. Accordingly, claims 1-6, 9-14, 17-27,
31, 35-44, 48, 52 and 53 are pending in this Office action.



Claim Rejections - 35 USC § 103 

The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148
USPQ 459 (1966), that are applied for establishing a background for determining 
obviousness under 35 U.S.C. 103(a) are summarized as follows: 

1. Determining the scope and contents of the prior art.

2. Ascertaining the differences between the prior art and the claims at issue. 

3. Resolving the level of ordinary skill in the pertinent art. 

4. Considering objective evidence present in the application indicating 
obviousness or nonobviousness. 

Claims 1-6, 9-14, 17-23, 35-40, 52 and 53 are rejected under 35 U.S.C. 103(a) as
being unpatentable over WO 03060766 (hereinafter Lind) (art of record) in view of US
6560597 (hereinafter Dhil).

As for claim 1, Lind discloses: a scoring module determining a score which is
assigned to at least one concept that has been extracted from a plurality of
electronically-stored documents (See page 17 lines 20-24; note: definition of document
corpus), wherein the score is calculated as a function of a summation of a frequency of
occurrence of the at least one concept within at least one such document, a concept
weight, a structural weight, and a corpus weight, forming the score assigned to the at
least one concept as a normalized score vector for each such document, and determining a
similarity between the normalized score vector for each such document as an inner
product of each normalized score vector (See page 30 line 30 - page 31 line 1);

(See page 7 lines 20-24) a clustering module forming clusters of the
documents comprising a selection submodule selecting a set of candidate seed
documents from the plurality of documents; a seed document identification submodule
identifying a set of seed documents by applying the similarity to each such candidate
seed document and selecting those candidate seed documents that are sufficiently
unique from other candidate seed documents as the seed documents; a non-seed
document identification submodule identifying a plurality of non-seed documents; a
comparison submodule determining the similarity between each non-seed document
and a center of each cluster; and a clustering submodule grouping each such non-seed
document into a cluster with a best fit, subject to a minimum fit (See page 28 lines 8-16;
note: representative = seed), by evaluating the score for the at least one concept of each
document for a best fit to the clusters and assigning each document to the cluster with the
best fit (See page 19 lines 4-10).

While Lind does not differ substantially from the claimed invention, the disclosure
of a threshold module determining the similarity between each of the documents grouped
into each cluster based on the center of the cluster and the scores assigned to each of
the at least one concepts in that document, dynamically determining a threshold for each
cluster as a function of the similarity between each of the documents, and identifying
and reassigning each of the documents having the similarity falling outside the
threshold, is not necessarily explicit. Dhil, however, does disclose a threshold module
determining the similarity between each of the documents grouped into each cluster
based on the center of the cluster and the scores assigned to each of the at least one
concepts in that document, dynamically determining a threshold for each cluster as a
function of the similarity between each of the documents (See column 3 lines 55-60 and
column 5 line 55 - column 6 line 5), and identifying and reassigning each of the
documents having the similarity falling outside the threshold (See column 3 lines 60-65).
It would have been obvious to an artisan of ordinary skill in the pertinent art at the
time the invention was made to have incorporated the teaching of Dhil into the system of
Lind. The modification would have been obvious because the two references are concerned
with the solution to the problem of efficient document scoring and clustering; therefore
there is an implicit motivation to combine these references. In other words, the
ordinary skilled artisan, during his/her quest for a solution to the cited problem, would
look to the cited references at the time the invention was made. Consequently, the
ordinary skilled artisan would have been motivated to combine the cited references since
Dhil's teaching would enable Lind's users to reclassify documents based on the center of
the cluster.
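
For illustration only, the clustering and threshold limitations discussed for claim 1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions and is not taken from Lind, Dhil, or the claims: the function names, the cut-offs `max_sim` and `min_fit`, the use of a normalized mean as the cluster center, and the "mean minus k standard deviations" threshold rule are all hypothetical choices made only to show how the recited steps could fit together.

```python
import math
from statistics import mean, pstdev
from typing import Dict, List, Tuple

Vector = Dict[str, float]  # concept -> score (hypothetical representation)


def normalize(v: Vector) -> Vector:
    """Scale a score vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v.values()))
    return {k: x / norm for k, x in v.items()} if norm else dict(v)


def similarity(a: Vector, b: Vector) -> float:
    """Inner product of two normalized score vectors."""
    return sum(x * b.get(k, 0.0) for k, x in a.items())


def center(members: List[Vector]) -> Vector:
    """Normalized mean of the member vectors (one possible cluster center)."""
    total: Vector = {}
    for m in members:
        for k, x in m.items():
            total[k] = total.get(k, 0.0) + x
    return normalize(total)


def pick_seeds(candidates: List[Vector], max_sim: float) -> List[Vector]:
    """Keep candidates that are sufficiently unique from the seeds chosen so far."""
    seeds: List[Vector] = []
    for c in candidates:
        if all(similarity(c, s) < max_sim for s in seeds):
            seeds.append(c)
    return seeds


def assign(docs: List[Vector], seeds: List[Vector],
           min_fit: float) -> Tuple[List[List[Vector]], List[Vector]]:
    """Group each non-seed document with the best-fitting cluster, subject to a minimum fit."""
    if not seeds:
        raise ValueError("at least one seed document is required")
    clusters = [[s] for s in seeds]
    unassigned: List[Vector] = []
    for d in docs:
        fits = [similarity(d, center(c)) for c in clusters]
        best = max(range(len(fits)), key=fits.__getitem__)
        if fits[best] >= min_fit:
            clusters[best].append(d)
        else:
            unassigned.append(d)
    return clusters, unassigned


def outliers(cluster: List[Vector], k: float = 2.0) -> List[Vector]:
    """Documents whose similarity to the center falls below a per-cluster threshold."""
    c = center(cluster)
    sims = [similarity(d, c) for d in cluster]
    threshold = mean(sims) - k * pstdev(sims)  # hypothetical dynamic rule
    return [d for d, s in zip(cluster, sims) if s < threshold]
```

Documents returned by `outliers` would then be candidates for reassignment to another cluster. Because the threshold is recomputed from each cluster's own similarity values, it varies from cluster to cluster, which is one possible reading of "dynamically determining a threshold for each cluster".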

As for claim 3, the rejection of claim 1 is incorporated, and further Lind discloses:
a compression module compressing the score through logarithmic compression (See
page 17 lines 30-34).

As for claim 4, the rejection of claim 1 is incorporated, and further Lind discloses:
the scoring module calculating the concept weight as a function of a number of terms 
comprising the at least one concept (See page 21 lines 25-28). 

As for claim 5, the rejection of claim 1 is incorporated, and further Lind discloses:
the scoring module calculating the structural weight as a function of a location of the at
least one concept within the at least one such document (See page 18 lines 10-14).

As for claim 6, the rejection of claim 1 is incorporated, and further Lind discloses:
the scoring module calculating the corpus weight as a function of a reference count of
the at least one concept over the plurality of documents (See page 18 lines 19-21; note:
this is an inverse weight of the reference count).

Claims 9 and 11-14 are method claims corresponding to system claims 1 and 3-6,
respectively, and are thus rejected for the reasons set forth in the rejection of claims 1
and 3-6.

Claim 17 is rejected for the same reasons as claim 9. 

As for claim 18, Lind discloses: a scoring module scoring a document in an
electronically-stored document set comprising: a frequency module determining a
frequency of occurrence of at least one concept within a document (See page 18 lines
1-3); a concept weight module analyzing a concept weight reflecting a specificity of
meaning for the at least one concept within the document (See page 25 lines 27-30;
note: rtc(t,c) is a value based on meaning); a structural weight module analyzing a
structural weight reflecting a degree of significance based on structural location within
the document for the at least one concept (See page 18 lines 8-13); a corpus weight
module analyzing a corpus weight inversely weighing a reference count of occurrences
for the at least one concept within the document (See page 18 lines 19-21; note: this is
an inverse weight of the reference count); and a scoring evaluation module evaluating a
score to be associated with the at least one concept as a function of the frequency,
concept weight, structural weight, and corpus weight (See page 21 lines 24-27); and

a vector module forming the score assigned to the at least one concept as a
normalized score vector for each such document in the electronically-stored document
set, and a determination module determining a similarity between the normalized score
vector for each such document as an inner product of each normalized score vector
(See page 30 line 30 - page 31 line 1). A clustering module grouping the documents by
the score into a plurality of clusters comprising: a selection submodule evaluating a set
of candidate seed documents selected from the electronically-stored document set; a
cluster seed submodule identifying seed documents by applying the similarity to each
such candidate seed document and selecting those candidate seed documents that are
sufficiently unique from other candidate seed documents as the seed documents; an
identification submodule identifying a plurality of non-seed documents; a comparison
submodule determining the similarity between each non-seed document and a center of
each cluster; and a clustering submodule assigning each non-seed document to the
cluster with the best fit, subject to a minimum fit (See page 28 lines 8-16; note:
representative = seed) (See column 5 lines 35-42).

While Lind does not differ substantially from the claimed invention, the disclosure
of a threshold module relocating outlier documents, comprising determining the
similarity between each of the documents grouped into each cluster based on the center
of the cluster and the scores assigned to each of the at least one concepts in that
document, dynamically determining a threshold for each cluster as a function of the
similarity between each of the documents, and identifying and reassigning each of the
documents with the similarity falling outside the threshold, is not necessarily explicit.
Dhil, however, does disclose a threshold module determining the similarity between each
of the documents grouped into each cluster based on the center of the cluster and the
scores assigned to each of the at least one concepts in that document, dynamically



Application/Control Number: 10/626,984 Page 8 

Art Unit: 2166 

determining a threshold for each cluster as a function of the similarity between each of
the documents (See column 3 lines 55-60 and column 5 line 55 - column 6 line 5); and
identifying and reassigning each of the documents having the similarity falling outside
the threshold (See column 3 lines 60-65). It would have been obvious to an artisan of
ordinary skill in the pertinent art at the time the invention was made to have incorporated
the teaching of Dhil into the system of Lind. The modification would have been obvious
because the two references are concerned with the solution to the problem of efficient
document scoring and clustering; therefore there is an implicit motivation to combine
these references. In other words, the ordinary skilled artisan, during his/her quest for a
solution to the cited problem, would look to the cited references at the time the invention
was made. Consequently, the ordinary skilled artisan would have been motivated to
combine the cited references since Dhil's teaching would enable Lind's users to
reclassify documents based on the center of the cluster.

As for claim 19, the rejection of claim 18 is incorporated, and further Lind
discloses: the scoring module evaluating the score in accordance with the formula

\[ S_i = \sum_{j} f_{ij} \times cw_{ij} \times sw_{ij} \times rw_{ij} \]

where $S_i$ comprises the score, $f_{ij}$ comprises the frequency,
$0 < cw_{ij} \le 1$ comprises the concept weight, $0 < sw_{ij} \le 1$ comprises the
structural weight, and $0 < rw_{ij} \le 1$ comprises the corpus weight for occurrence $j$
of concept $i$ (See page 23 lines 1-4).
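
For illustration only, the score formula recited for claim 19 can be sketched in Python. This is a minimal sketch that assumes the formula is a summation over the occurrences j of a concept i, with one frequency and three weights per occurrence. The names `Occurrence` and `concept_score` are hypothetical and appear in neither the claims nor the cited references.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Occurrence:
    """One occurrence j of a concept i (hypothetical container)."""
    frequency: float          # f_ij
    concept_weight: float     # cw_ij, assumed to lie in (0, 1]
    structural_weight: float  # sw_ij, assumed to lie in (0, 1]
    corpus_weight: float      # rw_ij, assumed to lie in (0, 1]


def concept_score(occurrences: Iterable[Occurrence]) -> float:
    """Sum f_ij * cw_ij * sw_ij * rw_ij over all occurrences j of one concept."""
    return sum(
        o.frequency * o.concept_weight * o.structural_weight * o.corpus_weight
        for o in occurrences
    )
```

Because every weight is assumed to be at most 1, each occurrence contributes no more than its raw frequency to the score.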




As for claim 20, the rejection of claim 19 is incorporated, and further Lind
discloses: the concept weight module evaluating the concept weight in accordance with
the formula:

\[ cw_{ij} = \begin{cases}
0.25 + (0.25 \times t_{ij}), & 1 \le t_{ij} \le 3 \\
0.25 + (0.25 \times [7 - t_{ij}]), & 4 \le t_{ij} \le 6 \\
0.25, & t_{ij} \ge 7
\end{cases} \]

(See page 17 lines 30-34).
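
For illustration only, the piecewise concept weight recited for claim 20 can be sketched in Python. This is a minimal sketch that assumes t_ij is the number of terms comprising the concept (consistent with the discussion of claim 4) and that the three ranges are inclusive, so that every term count of 1 or more falls in exactly one branch. The function name `concept_weight` is hypothetical.

```python
def concept_weight(term_count: int) -> float:
    """Piecewise concept weight cw_ij for a concept made of term_count terms.

    Assumes inclusive ranges: 1 to 3 terms, 4 to 6 terms, and 7 or more terms.
    """
    if term_count < 1:
        raise ValueError("a concept must contain at least one term")
    if term_count <= 3:
        return 0.25 + 0.25 * term_count
    if term_count <= 6:
        return 0.25 + 0.25 * (7 - term_count)
    return 0.25
```

Under these assumptions the weight rises to its maximum for concepts of three or four terms and falls back to 0.25 for concepts of seven or more terms.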

As for claim 21, the rejection of claim 19 is incorporated, and further Lind
discloses: the structural weight module evaluating the structural weight in accordance
with the formula:

\[ sw_{ij} = \begin{cases}
1.0, & \text{if } (j \approx \text{SUBJECT}) \\
0.8, & \text{if } (j \approx \text{HEADING}) \\
0.7, & \text{if } (j \approx \text{SUMMARY}) \\
0.5, & \text{if } (j \approx \text{BODY}) \\
0.1, & \text{if } (j \approx \text{SIGNATURE})
\end{cases} \]

where $sw_{ij}$ comprises the structural weight for occurrence $j$ of each such concept
$i$ (See page 21 lines 25-29).
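
For illustration only, the structural weight recited for claim 21 can be sketched in Python as a lookup table. This is a minimal sketch that assumes each occurrence has already been labeled with the document section in which it appears. The names `STRUCTURAL_WEIGHTS` and `structural_weight` are hypothetical.

```python
# Weight per structural location, as listed in the formula for claim 21.
STRUCTURAL_WEIGHTS = {
    "SUBJECT": 1.0,
    "HEADING": 0.8,
    "SUMMARY": 0.7,
    "BODY": 0.5,
    "SIGNATURE": 0.1,
}


def structural_weight(location: str) -> float:
    """Return sw_ij for an occurrence found in the given document section."""
    key = location.strip().upper()
    if key not in STRUCTURAL_WEIGHTS:
        raise KeyError(f"unknown structural location: {location!r}")
    return STRUCTURAL_WEIGHTS[key]
```

Rejecting unknown locations, rather than silently defaulting to a weight, is a design choice of this sketch and is not stated in the formula.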



As for claim 22, the rejection of claim 19 is incorporated, and further Lind
discloses: the corpus weight module evaluating the corpus weight in accordance with
the formula:

\[ rw_{ij} = \begin{cases}
\left( \dfrac{T - r_{ij}}{T} \right)^{2}, & r_{ij} > M \\
1.0, & r_{ij} \le M
\end{cases} \]

where $rw_{ij}$ comprises the corpus weight, $r_{ij}$ comprises a reference count for
occurrence $j$ of each such concept $i$, $T$ comprises a total number of reference counts
of documents in the document set, and $M$ comprises a maximum reference count of
documents in the document set (See page 23 lines 20-23).
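
For illustration only, the corpus weight recited for claim 22 can be sketched in Python. This is a minimal sketch that assumes the second branch is inclusive (reference counts equal to M keep the full weight of 1.0) and that T is positive. The function and parameter names are hypothetical.

```python
def corpus_weight(reference_count: int, total: int, max_count: int) -> float:
    """Corpus weight rw_ij: down-weight concepts referenced more than max_count times.

    reference_count is r_ij, total is T, and max_count is M.
    """
    if total <= 0:
        raise ValueError("total reference count T must be positive")
    if reference_count > max_count:
        return ((total - reference_count) / total) ** 2
    return 1.0
```

The squared term shrinks toward zero as the reference count approaches the total, so widely referenced concepts contribute less to the score, in line with the note that this is an inverse weight of the reference count.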

As for claim 23, the rejection of claim 19 is incorporated, and further Lind
discloses: a compression module compressing the score in accordance with the formula
$S'_i = \log(S_i + 1)$, where $S'_i$ comprises the compressed score for each such
concept $i$ (See page 27 lines 1-7).
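
For illustration only, the logarithmic compression recited for claim 23 can be sketched in Python. This is a minimal sketch that assumes a natural logarithm, since the formula does not state the base, and a non-negative input score. The function name `compress_score` is hypothetical.

```python
import math


def compress_score(score: float) -> float:
    """Compress a raw concept score as log(score + 1), assuming base e."""
    if score < 0:
        raise ValueError("score must be non-negative")
    return math.log(score + 1.0)
```

Adding 1 before taking the logarithm maps a score of zero to zero and keeps the compressed score non-negative.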

Claims 35-40 are method claims comprising substantially the same limitation as 
system claims 18-23, and are thus rejected for the reasons set forth in the rejection of 
claims 18-23. 

Claim 52 is rejected for substantially the same reasons as claim 35.

Claim 53 is an apparatus claim corresponding to method claim 18 and is thus
rejected for the same reasons as claim 18.



Claims 24-27 and 41-44 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Lind and Dhil as applied to claims 18 and 35 above, and further in
view of US 6675159 (hereinafter Klein) (art of record).

As for claim 24, the rejection of claim 18 is incorporated, and further Klein
discloses: a global stop concept vector cache maintaining concepts and terms (See
column 18 lines 17-20 and column 14 lines 45-49); and a filtering module filtering
selection of the at least one concept based on the concepts and terms maintained in the
global stop concept vector cache (See column 14 lines 45-50). It would have been
obvious to an artisan of ordinary skill in the pertinent art at the time of the invention to
have incorporated the teachings of Klein into the system of Lind. The modification would
have been obvious because queries and documents are linked by the fact that words are
the entities being processed. Therefore, any transformation capable of being
made to a query should also be capable of being applied to documents, which makes
document management systems more efficient and easier to maintain.

As for claim 25 the rejection of claim 18 is incorporated, and further Klein 
discloses: a parsing module identifying terms within at least one document in the 
document set, and combining the identified terms into one or more of the concepts (See
column 2 lines 53-56). 

As for claim 26 the rejection of claim 25 is incorporated, and further Klein 
discloses: the parsing module structuring each such identified term in the one or more 
concepts into canonical concepts comprising at least one of word root, character case, 
and word ordering (See column 14 lines 63-67). 

As for claim 27 the rejection of claim 25 is incorporated, and further Klein 
discloses wherein at least one of nouns, proper nouns and adjectives are included as 

Claims 41-44 are method claims corresponding to system claims 24-27,
respectively, and are thus rejected for the same reasons as set forth in the rejection of
claims 24-27.

Claims 30, 31, 47 and 48 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Lind and Dhil as applied to claim 29 above, and further in view of
Klein.

As for claim 30 the rejection of claim 29 is incorporated, and further Klein 
discloses: a normalized score vector for each document comprising the score 
associated with the at least one concept for each such concept occurring within the
document (See column 3 lines 18-21); and the similarity module determining the 
similarity as a function of the normalized score vector associated with the at least one 
concept for each such document (See column 18 lines 23-26). 

As for claim 31, the rejection of claim 30 is incorporated, and further Klein
discloses: the similarity submodule calculating the similarity in accordance with the
formula

\[ \cos \sigma_{AB} = \frac{\langle \vec{S}_A \cdot \vec{S}_B \rangle}{|\vec{S}_A| \, |\vec{S}_B|} \]

where $\cos \sigma_{AB}$ comprises a similarity between a document A and a document B,
$\vec{S}_A$ comprises a score vector for document A, and $\vec{S}_B$ comprises a score
vector for document B.
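
For illustration only, the similarity formula recited for claim 31 can be sketched in Python. This is a minimal sketch that assumes the formula is the cosine of the angle between two score vectors (inner product divided by the product of the vector magnitudes) and that each score vector is stored as a mapping from concept to score. The function name `cosine_similarity` and the dictionary representation are hypothetical.

```python
import math
from typing import Dict


def cosine_similarity(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two concept-score vectors.

    Concepts missing from a vector are treated as having a score of zero.
    Returns 0.0 if either vector has zero magnitude.
    """
    dot = sum(score * b.get(concept, 0.0) for concept, score in a.items())
    norm_a = math.sqrt(sum(score * score for score in a.values()))
    norm_b = math.sqrt(sum(score * score for score in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

When both vectors have already been normalized to unit length, the denominator is 1 and the result reduces to the inner product of the normalized score vectors.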

Claim 48 is a method claim corresponding to the system of claim 31
and is thus rejected for the same reasons as set forth in the rejection of claim 31.

Response to Arguments

Applicant's arguments filed 2/27/08 have been fully considered but they are not 
persuasive. 
Applicant argues: 

One skilled in the art would not be motivated to combine the teaching of Lindh
with the teachings of Dhillon. Lindh focuses on improving a quality of a search to locate
terms synonymous with a search term, whereas Dhillon focuses on improving search
efficiency by iterative partitioning of documents. Lindh uses clustering as a method to
reduce the number of similar documents in a document corpus, and not for a
representation of the similar documents. Instead, Dhillon uses clustering as a method to
partition and display documents for use in a search. Accordingly, a teaching,
suggestion, or motivation to combine Lindh and Dhillon has not been shown.

Examiner responds: 

Examiner is not persuaded. In response to applicant's argument that there is no 
suggestion to combine the references, the examiner recognizes that obviousness can 
only be established by combining or modifying the teachings of the prior art to produce 
the claimed invention where there is some teaching, suggestion, or motivation to do so 
found either in the references themselves or in the knowledge generally available to one 
of ordinary skill in the art. See In re Fine, 837 F.2d 1071, 5 USPQ2d 1596 (Fed. Cir. 
1988) and In re Jones, 958 F.2d 347, 21 USPQ2d 1941 (Fed. Cir. 1992). In this case, it 
would have been obvious to a person of ordinary skill in the art at the time the invention 
was made to have incorporated Dhil's teaching into the system of Lind due to the need
to reclassify documents based on the center of a cluster. 

Applicant argues: 

Moreover, modifying the teachings of Dhillon to consider dynamic data would not
be predictable, as Dhillon teaches a static threshold for determining a stopping point for
partitioning iterations. A fixed threshold is not adaptable, and replacing the fixed
threshold with a dynamic threshold requires implementing functionality that continually
adapts the threshold. Dhillon neither teaches nor suggests allowing the threshold to be
dynamically redefined. Second, a finding that there was reasonable expectation of
success must be made. MPEP 2143(G)(2). Claims 1, 9, 17, 18, 35, 52, and 53 have
been read on a combination of Lindh and Dhillon, but how the combination would be
reasonably expected to succeed has not been explained. "The mere fact that references
can be combined or modified does not render the resultant combination obvious unless the
results would have been predictable to one of ordinary skill in the art." MPEP
2143.01(III) (citing KSR International Co. v. Teleflex Inc., 550 U.S. ___, 82 USPQ2d
1385, 1396 (2007)).

Examiner responds: 

Examiner is not persuaded. Examiner notes that a prima facie case of 
obviousness is established when the teachings from the prior art itself would appear to 
have suggested the claimed subject matter to a person of ordinary skill in the art. Once 
such a case is established, it is incumbent upon appellant to go forward with objective
evidence of unobviousness. In re Fielder, 471 F.2d 640, 176 USPQ 300 (CCPA 1973).

Applicant argues: 

Claim 1 incorporates the limitations of now-canceled dependent Claim 2, and now
recites a scoring module determining a score, which is assigned to at least one concept
that has been extracted from a plurality of electronically-stored documents, wherein the
score is calculated as a function of a summation of a frequency of occurrence of the at
least one concept within at least one such document, a concept weight, a structural
weight, and a corpus weight. Claim 9 has been amended to incorporate the limitations
of now-canceled dependent Claim 10. Amended Claim 9 now recites determining a
score, which is assigned to at least one concept that has been extracted from a plurality
of electronically-stored documents, wherein the score is calculated as a function of a 
summation of a frequency of occurrence of the at least one concept within at least one 
such document, a concept weight, a structural weight, and a corpus weight. Claim 17 
has also been amended to incorporate the limitations consistent with Claim 1, as 
amended. Amended Claim 17 recites code for determining a score, which is assigned to 
at least one concept that has been extracted from a plurality of electronically-stored 
documents, wherein the score is calculated as a function of a summation of a frequency 
of occurrence of the at least one concept within at least one such document, a concept 
weight, a structural weight, and a corpus weight. 

Examiner responds:

Examiner is not persuaded. Examiner is entitled to give claim limitations their 
broadest reasonable interpretation in light of the specification. Interpretation of Claims- 
Broadest Reasonable Interpretation: During patent examination, the pending claims 
must be 'given the broadest reasonable interpretation consistent with the specification.' 
Applicant always has the opportunity to amend the claims during prosecution and broad 
interpretation by the examiner reduces the possibility that the claim, once issued, will be 
interpreted more broadly than is justified. In re Prater, 162 USPQ 541, 550-51 (CCPA
1969). In this case the claim interpretation depends on how one defines "a function of,"
which determines precisely what defines the bounds of the claim with respect to the
score.

Information

Any inquiry concerning this communication or earlier communications from the
examiner should be directed to Leon J. Harper whose telephone number is
571-272-0759. The examiner can normally be reached on 7:30 AM - 4:00 PM.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Hosain T. Alam, can be reached on 571-272-3978. The fax phone number
for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 



LJH 

Leon J. Harper 
May 25, 2008 

/Hosain T Alam/ 

Supervisory Patent Examiner, Art Unit 2166 



